A Comparison of the Performance of SaP::GPU and Intel’s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

Authors

  • Ang Li
  • Omkar Deshmukh
  • Radu Serban
  • Dan Negrut
Abstract

SaP::GPU is a solver developed in the Simulation Based Engineering Lab (SBEL) [1] for solving large banded and sparse linear systems on the GPU. This report compares the performance of the banded solver in SaP::GPU with that of Intel’s Math Kernel Library (MKL) [2] on a large set of synthetic problems. The numerical experiments indicate that, for large dense banded matrices, SaP::GPU is two to five times faster than the latest version of the MKL dense solver when the latter runs on the Haswell, Ivy Bridge, or Phi architectures.
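For readers unfamiliar with the problem class, the following is a minimal pure-Python sketch of what a dense banded system looks like and how elimination can be confined to the band. It is not the SaP::GPU or MKL algorithm. The function names (`make_banded`, `matvec`, `solve_banded_dense`) and the half-bandwidth parameter `k` are hypothetical, and the sketch assumes a diagonally dominant matrix so that no pivoting is needed.

```python
def make_banded(n, k):
    """Build an n x n diagonally dominant matrix with half-bandwidth k.

    Entries with |i - j| > k are zero (assumed test matrix, not from the report).
    """
    return [
        [float(2 * k + 1) if i == j else (-1.0 if abs(i - j) <= k else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def matvec(A, x):
    """Plain dense matrix-vector product, used only to build a right-hand side."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve_banded_dense(A, b, k):
    """Solve A x = b assuming A[i][j] == 0 whenever |i - j| > k.

    Gaussian elimination without pivoting, touching only entries inside
    the band. Assumes nonzero pivots (e.g. A is diagonally dominant).
    """
    n = len(b)
    A = [row[:] for row in A]   # work on copies; leave the inputs intact
    x = list(b)

    # Forward elimination: column `col` only affects the next k rows,
    # and each of those rows only changes in columns col .. col + k.
    for col in range(n):
        pivot = A[col][col]
        last = min(n, col + k + 1)
        for row in range(col + 1, last):
            factor = A[row][col] / pivot
            for j in range(col, last):
                A[row][j] -= factor * A[col][j]
            x[row] -= factor * x[col]

    # Back substitution: row `row` has at most k entries right of the diagonal.
    for row in range(n - 1, -1, -1):
        last = min(n, row + k + 1)
        s = x[row]
        for j in range(row + 1, last):
            s -= A[row][j] * x[j]
        x[row] = s / A[row][row]
    return x


if __name__ == "__main__":
    n, k = 8, 2
    A = make_banded(n, k)
    x_true = [float(i + 1) for i in range(n)]
    x = solve_banded_dense(A, matvec(A, x_true), k)
    print(max(abs(a - t) for a, t in zip(x, x_true)))  # residual error, should be tiny
```

Restricting the loops to the band reduces the work from roughly N^3 to roughly N k^2 operations, which is why banded solvers are treated separately from general dense solvers. Production libraries additionally use compact band storage and pivoting, both omitted here.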

Similar resources

Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

We discuss an approach for solving sparse or dense banded linear systems Ax = b on a graphics processing unit (GPU) card. The matrix A ∈ ℝ^{N×N} is possibly nonsymmetric and moderately large, i.e., 10,000 ≤ N ≤ 500,000. The split and parallelize (SaP) approach seeks to partition the matrix A into diagonal subblocks A_i, i = 1, ..., P, which are independently factored in parallel. The solution...

SPIKE::GPU A SPIKE-based preconditioned GPU Solver for Sparse Linear Systems

This contribution outlines an approach that draws on general purpose graphics processing unit (GPGPU) computing to solve large linear systems. The methodology proposed relies on a SPIKE-based preconditioner with a Krylov-subspace method and has the following three stages: (i) row/column reordering for boosting diagonal dominance and reducing bandwidth; (ii) applying single precision truncated SP...

Toward a High Performance Tile Divide and Conquer Algorithm for the Dense Symmetric Eigenvalue Problem

Classical solvers for the dense symmetric eigenvalue problem suffer from the first step involving a reduction to tridiagonal form that is dominated by the cost of accessing memory during the panel factorization. The solution is to reduce the matrix to a banded form, which then requires the eigenvalues of the banded matrix to be computed. The standard D&C algorithm can be modified for this purpo...

On the performance and energy efficiency of sparse linear algebra on GPUs

In this paper we unveil some performance and energy efficiency frontiers for sparse computations on GPU-based supercomputers. We compare the resource efficiency of different sparse matrix–vector products (SpMV) taken from libraries such as cuSPARSE and MAGMA for GPU and Intel’s MKL for multicore CPUs, and develop a GPU sparse matrix–matrix product (SpMM) implementation that handles the simultan...

A scalable approach to solving dense linear algebra problems on hybrid CPU-GPU systems

Aiming to fully exploit the computing power of all CPUs and all GPUs on hybrid CPU-GPU systems to solve dense linear algebra problems, we design a class of heterogeneous tile algorithms to maximize the degree of parallelism, to minimize the communication volume, as well as to accommodate the heterogeneity between CPUs and GPUs. The new heterogeneous tile algorithms are executed upon our decentr...


Publication date: 2015